In Defense of the Unitary Scalarization for Deep Multi-Task Learning
While some work shows that multi-task networks trained via unitary scalarization exhibit superior performance to independent per-task models [29, 35], others suggest the opposite [30, 54, 58]. However, SMTOs usually require access to per-task gradients, either with respect to the shared parameters or to the shared representation.
Recent multi-task learning research argues against unitary scalarization, where training simply minimizes the sum of the task losses. Several ad-hoc multi-task optimization algorithms have instead been proposed, inspired by various hypotheses about what makes multi-task settings difficult. The majority of these optimizers require per-task gradients, and introduce significant memory, runtime, and implementation overhead. We show that unitary scalarization, coupled with standard regularization and stabilization techniques from single-task learning, matches or improves upon the performance of complex multi-task optimizers in popular supervised and reinforcement learning settings. We then present an analysis suggesting that many specialized multi-task optimizers can be partly interpreted as forms of regularization, potentially explaining our surprising results. We believe our results call for a critical reevaluation of recent research in the area.
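The contrast the abstract draws can be made concrete with a toy example. Below is a minimal sketch, assuming a hypothetical one-parameter linear model with a squared-error loss per task (`task_loss_grad`, `w`, and the `tasks` data are illustrative, not from the paper): unitary scalarization needs only the gradient of the summed loss, while specialized multi-task optimizers materialize one gradient per task before combining them, which is where their extra memory and runtime overhead comes from.

```python
def task_loss_grad(w, x, y):
    """Gradient of the squared error (w*x - y)**2 w.r.t. the shared scalar w."""
    return 2 * (w * x - y) * x

w = 0.5                              # shared parameter
tasks = [(1.0, 2.0), (3.0, -1.0)]    # one (x, y) pair per task

# Unitary scalarization: a single gradient of the sum of task losses.
unitary_grad = sum(task_loss_grad(w, x, y) for x, y in tasks)

# SMTOs instead keep each task's gradient separately before combining them,
# storing one gradient per task rather than a single accumulated one.
per_task_grads = [task_loss_grad(w, x, y) for x, y in tasks]

# For a plain sum the two coincide: grad of the sum equals the sum of grads.
assert abs(unitary_grad - sum(per_task_grads)) < 1e-12
```

In a deep network the same point holds at scale: the summed loss needs one backward pass, whereas per-task gradients require one backward pass (and one stored gradient) per task.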
Kurin, Vitaly, De Palma, Alessandro, Kostrikov, Ilya, Whiteson, Shimon, Kumar, M. Pawan